genextreme (Generalized Extreme Value / GEV) distribution#

The generalized extreme value (GEV) distribution is the canonical model for block maxima: if you take maxima of large i.i.d. samples (annual maximum rainfall, yearly peak wind speed, worst daily loss in a month, …), then after an affine re-scaling the only possible non-degenerate limit laws are GEV distributions.

In SciPy, the GEV family is implemented as scipy.stats.genextreme (distribution name: genextreme).


Learning goals#

  • classify the distribution (support + parameter space) and understand SciPy’s sign convention

  • write down the PDF/CDF and the inverse CDF (quantile function)

  • derive mean/variance (and conditions for existence) using a clean exponential change-of-variables trick

  • implement NumPy-only PDF/CDF/PPF/sampling and validate against SciPy

  • fit a GEV by MLE, and use it in simple extreme-value workflows (return levels, tests)

import numpy as np

import plotly.graph_objects as go
import os
import plotly.io as pio
from plotly.subplots import make_subplots

from scipy import stats
from scipy.stats import genextreme as genextreme_sp
from scipy.special import gamma, zeta

pio.templates.default = "plotly_white"
pio.renderers.default = os.environ.get("PLOTLY_RENDERER", "notebook")

np.set_printoptions(precision=6, suppress=True)
rng = np.random.default_rng(7)

PI = np.pi
EULER_GAMMA = np.euler_gamma

1) Title & classification#

  • Name: genextreme (generalized extreme value / GEV)

  • Type: continuous distribution

EVT (textbook) parameterization#

A common parameterization in extreme value theory is:

  • location: \(\mu \in \mathbb{R}\)

  • scale: \(\sigma > 0\)

  • shape: \(\xi \in \mathbb{R}\)

with support

\[ 1 + \xi\,\frac{x-\mu}{\sigma} > 0. \]

Equivalently:

  • if \(\xi>0\): \(x > \mu - \sigma/\xi\) (heavy right tail)

  • if \(\xi=0\): \(x \in \mathbb{R}\) (Gumbel type)

  • if \(\xi<0\): \(x < \mu - \sigma/\xi\) (finite upper end-point)

SciPy parameterization (scipy.stats.genextreme)#

SciPy uses parameters (c, loc, scale) with scale > 0. The sign convention is:

\[ \xi = -c,\qquad \mu=\mathrm{loc},\qquad \sigma=\mathrm{scale}. \]

So the SciPy support condition is

\[ 1 - c\,\frac{x-\mathrm{loc}}{\mathrm{scale}} > 0. \]

That means:

  • c > 0 corresponds to \(\xi<0\) (bounded above)

  • c < 0 corresponds to \(\xi>0\) (heavy tail)
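A quick numeric check of the sign convention, using SciPy's generic `support` method on the standard form (loc=0, scale=1); the shape values ±0.5 are arbitrary choices:

```python
import numpy as np
from scipy.stats import genextreme

# c > 0 (xi < 0): bounded above at 1/c in the standard form
lo_pos, hi_pos = genextreme.support(0.5)

# c < 0 (xi > 0): bounded below at 1/c
lo_neg, hi_neg = genextreme.support(-0.5)

print("c=+0.5 support:", (float(lo_pos), float(hi_pos)))  # (-inf, 2.0)
print("c=-0.5 support:", (float(lo_neg), float(hi_neg)))  # (-2.0, inf)
```

Both endpoints agree with the EVT formula \(\mu-\sigma/\xi\) after substituting \(\xi=-c\).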

2) Intuition & motivation#

What it models#

GEV distributions arise as limits of normalized maxima. Let \(X_1,\dots,X_n\) be i.i.d. and define the block maximum \(M_n = \max_i X_i\). If there exist sequences \(a_n>0\) and \(b_n\in\mathbb{R}\) such that

\[ \frac{M_n - b_n}{a_n} \;\Rightarrow\; Z \qquad (n\to\infty), \]

and the limit \(Z\) is non-degenerate, then \(Z\) must be a GEV distribution. This is the Fisher–Tippett–Gnedenko theorem (the “CLT of extremes”).

Typical use cases#

  • Hydrology / climate: annual max rainfall, peak river discharge, max temperature

  • Engineering: max load per cycle, peak wind speed, material strength extremes

  • Finance / insurance: worst daily loss in a month, largest annual claim

  • Systems: peak latency, traffic spikes, queue maxima

Relations to other distributions#

The shape parameter splits the family into three classical types:

  • \(\xi=0\): Gumbel (light/exponential-type tail)

  • \(\xi>0\): Fréchet-type (heavy tail)

  • \(\xi<0\): Weibull-type (finite upper end-point)

A close cousin is the generalized Pareto distribution (GPD), which models exceedances over a high threshold (peaks-over-threshold). A common mental model:

  • use GEV for block maxima

  • use GPD for threshold exceedances

Both share the same tail-index information (the extreme value index).
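A tiny illustration of the two workflows on data where both answers are known exactly: for Exp(1) samples, memorylessness makes threshold exceedances exactly Exp(1) again (a GPD with \(\xi=0\)), while block maxima are approximately Gumbel. The threshold and block size below are arbitrary choices for this sketch:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
x = rng.exponential(size=200_000)

# Peaks-over-threshold: exceedances of Exp(1) over u are again Exp(1)
# (memorylessness), i.e. a GPD with shape xi = 0 and mean 1.
u = 2.0
exceed = x[x > u] - u
mean_exc = float(exceed.mean())

# Block maxima of Exp(1) converge to a Gumbel (GEV with xi = 0),
# so a GEV fit should recover a shape near zero.
maxima = x.reshape(2000, 100).max(axis=1)  # 2000 blocks of size 100
c_hat, loc_hat, scale_hat = stats.genextreme.fit(maxima)

print("mean exceedance (should be near 1):", mean_exc)
print("fitted GEV shape c (should be near 0):", c_hat)
```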

def gev_support(mu=0.0, sigma=1.0, xi=0.0):
    """Return (lower, upper) support bounds for GEV(μ, σ, ξ)."""
    if not (sigma > 0):
        raise ValueError("sigma must be > 0")

    if np.isclose(xi, 0.0):
        return (-np.inf, np.inf)

    bound = mu - sigma / xi
    if xi > 0:
        return (bound, np.inf)
    return (-np.inf, bound)


def gev_cdf(x, mu=0.0, sigma=1.0, xi=0.0):
    """GEV CDF in EVT parameterization (NumPy-only)."""
    if not (sigma > 0):
        raise ValueError("sigma must be > 0")

    x = np.asarray(x, dtype=float)
    z = (x - mu) / sigma

    if np.isclose(xi, 0.0):
        return np.exp(-np.exp(-z))

    w = 1.0 + xi * z
    out = np.empty_like(z, dtype=float)
    mask = w > 0

    out[mask] = np.exp(-np.power(w[mask], -1.0 / xi))

    # Outside support the CDF is exactly 0 or 1 depending on the tail type.
    out[~mask] = 0.0 if xi > 0 else 1.0
    return out


def gev_pdf(x, mu=0.0, sigma=1.0, xi=0.0):
    """GEV PDF in EVT parameterization (NumPy-only)."""
    if not (sigma > 0):
        raise ValueError("sigma must be > 0")

    x = np.asarray(x, dtype=float)
    z = (x - mu) / sigma

    if np.isclose(xi, 0.0):
        t = np.exp(-z)
        return (t * np.exp(-t)) / sigma

    w = 1.0 + xi * z
    out = np.zeros_like(z, dtype=float)
    mask = w > 0

    t = np.power(w[mask], -1.0 / xi)
    out[mask] = (1.0 / sigma) * np.power(w[mask], -1.0 / xi - 1.0) * np.exp(-t)
    return out


def gev_logpdf(x, mu=0.0, sigma=1.0, xi=0.0):
    """GEV log-PDF in EVT parameterization (NumPy-only)."""
    if not (sigma > 0):
        raise ValueError("sigma must be > 0")

    x = np.asarray(x, dtype=float)
    z = (x - mu) / sigma

    if np.isclose(xi, 0.0):
        return -np.log(sigma) - z - np.exp(-z)

    w = 1.0 + xi * z
    out = np.full_like(z, -np.inf, dtype=float)
    mask = w > 0

    t = np.power(w[mask], -1.0 / xi)
    out[mask] = -np.log(sigma) - (1.0 / xi + 1.0) * np.log(w[mask]) - t
    return out


def gev_ppf(u, mu=0.0, sigma=1.0, xi=0.0):
    """GEV quantile function (inverse CDF) in EVT parameterization (NumPy-only).

    Uses an `expm1`-based expression for numerical stability when |xi| is small.
    """
    if not (sigma > 0):
        raise ValueError("sigma must be > 0")

    u = np.asarray(u, dtype=float)
    if np.any((u <= 0.0) | (u >= 1.0)):
        raise ValueError("u must be in (0, 1)")

    y = -np.log(u)
    if np.isclose(xi, 0.0):
        z = -np.log(y)
    else:
        z = np.expm1(-xi * np.log(y)) / xi

    return mu + sigma * z


def gev_rvs(size=None, mu=0.0, sigma=1.0, xi=0.0, rng=None):
    """Sample from GEV(μ, σ, ξ) via inverse transform (NumPy-only)."""
    if rng is None:
        rng = np.random.default_rng()
    u = rng.random(size=size)
    return gev_ppf(u, mu=mu, sigma=sigma, xi=xi)


def evt_to_scipy(mu=0.0, sigma=1.0, xi=0.0):
    """Convert EVT params (μ, σ, ξ) to SciPy params (c, loc, scale)."""
    return (-xi, mu, sigma)


def scipy_to_evt(c=0.0, loc=0.0, scale=1.0):
    """Convert SciPy params (c, loc, scale) to EVT params (μ, σ, ξ)."""
    return (loc, scale, -c)


def genextreme_pdf(x, c=0.0, loc=0.0, scale=1.0):
    """SciPy-compatible `genextreme` PDF (NumPy-only wrapper)."""
    mu, sigma, xi = scipy_to_evt(c=c, loc=loc, scale=scale)
    return gev_pdf(x, mu=mu, sigma=sigma, xi=xi)


def genextreme_cdf(x, c=0.0, loc=0.0, scale=1.0):
    """SciPy-compatible `genextreme` CDF (NumPy-only wrapper)."""
    mu, sigma, xi = scipy_to_evt(c=c, loc=loc, scale=scale)
    return gev_cdf(x, mu=mu, sigma=sigma, xi=xi)


def genextreme_ppf(u, c=0.0, loc=0.0, scale=1.0):
    """SciPy-compatible `genextreme` PPF (NumPy-only wrapper)."""
    mu, sigma, xi = scipy_to_evt(c=c, loc=loc, scale=scale)
    return gev_ppf(u, mu=mu, sigma=sigma, xi=xi)

3) Formal definition#

Let \(z = (x-\mu)/\sigma\).

EVT parameterization#

For \(\xi \ne 0\) and \(1+\xi z > 0\):

CDF

\[ F(x;\mu,\sigma,\xi) = \exp\left(-\bigl(1+\xi z\bigr)^{-1/\xi}\right). \]

PDF

\[ f(x;\mu,\sigma,\xi) = \frac{1}{\sigma}\,\bigl(1+\xi z\bigr)^{-1/\xi-1} \exp\left(-\bigl(1+\xi z\bigr)^{-1/\xi}\right). \]

For \(\xi = 0\) (the continuous limit, Gumbel type):

CDF

\[ F(x;\mu,\sigma,0) = \exp\left(-\exp(-z)\right). \]

PDF

\[ f(x;\mu,\sigma,0) = \frac{1}{\sigma}\,\exp\bigl(-z-\exp(-z)\bigr). \]

SciPy standard form#

SciPy’s standard genextreme has parameters (c, loc=0, scale=1). For \(c\ne 0\) and \(1-cx>0\):

\[ F(x;c)=\exp\bigl(-(1-cx)^{1/c}\bigr), \qquad f(x;c)=\exp\bigl(-(1-cx)^{1/c}\bigr)\,(1-cx)^{1/c-1}. \]

For \(c=0\) (Gumbel):

\[ F(x;0)=\exp\bigl(-e^{-x}\bigr), \qquad f(x;0)=\exp\bigl(-e^{-x}\bigr)\,e^{-x}. \]

Mapping between the two:

\[ \xi=-c,\quad \mu=\mathrm{loc},\quad \sigma=\mathrm{scale}. \]

(With loc/scale, replace \(x\) by \((x-\mathrm{loc})/\mathrm{scale}\) and multiply the PDF by \(1/\mathrm{scale}\).)

4) Moments & properties#

Existence of moments#

For \(\xi>0\) (heavy tail), moments may diverge. A useful rule is:

The \(k\)-th absolute moment exists iff \(\xi < 1/k\).

So:

  • mean exists iff \(\xi < 1\)

  • variance exists iff \(\xi < 1/2\)

  • skewness exists iff \(\xi < 1/3\)

  • kurtosis exists iff \(\xi < 1/4\)

For \(\xi<0\) (bounded above), all moments exist.

Mean and variance#

For \(\xi \ne 0\) and \(\xi<1\),

\[ \mathbb{E}[X] = \mu + \frac{\sigma}{\xi}\left(\Gamma(1-\xi)-1\right). \]

For \(\xi \ne 0\) and \(\xi<1/2\),

\[ \mathrm{Var}(X) = \frac{\sigma^2}{\xi^2}\left(\Gamma(1-2\xi) - \Gamma(1-\xi)^2\right). \]

In the Gumbel limit \(\xi\to 0\):

\[ \mathbb{E}[X]=\mu+\sigma\,\gamma,\qquad \mathrm{Var}(X)=\sigma^2\,\frac{\pi^2}{6}, \]

where \(\gamma\approx 0.57721\) is the Euler–Mascheroni constant.
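A Monte Carlo sanity check of the mean/variance formulas (a sketch; the parameter values are arbitrary), using the inverse-transform representation \(X = \mu + \tfrac{\sigma}{\xi}(Y^{-\xi}-1)\) with \(Y\sim\mathrm{Exp}(1)\):

```python
import numpy as np
from scipy.special import gamma

rng = np.random.default_rng(0)
mu, sigma, xi = 0.0, 1.0, 0.2  # xi < 1/2, so mean and variance both exist

mean_th = mu + (sigma / xi) * (gamma(1.0 - xi) - 1.0)
var_th = (sigma**2 / xi**2) * (gamma(1.0 - 2.0 * xi) - gamma(1.0 - xi) ** 2)

# Sample X = mu + (sigma/xi) * (Y**(-xi) - 1) with Y ~ Exp(1)
y = rng.exponential(size=500_000)
x = mu + (sigma / xi) * (y ** (-xi) - 1.0)

mean_mc, var_mc = float(x.mean()), float(x.var())
print("mean: theory", mean_th, "MC", mean_mc)
print("var : theory", var_th, "MC", var_mc)
```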

Skewness and kurtosis#

Let \(g_k = \Gamma(1-k\xi)\) (defined when \(\xi<1/k\)). Define

\[ \mu_2 = g_2 - g_1^2,\quad \mu_3 = g_3 - 3g_2g_1 + 2g_1^3,\quad \mu_4 = g_4 - 4g_3g_1 + 6g_2g_1^2 - 3g_1^4. \]

Then

\[ \text{skewness} = \frac{\mu_3}{\mu_2^{3/2}},\qquad \text{kurtosis (excess)} = \frac{\mu_4}{\mu_2^2} - 3. \]

For \(\xi=0\) (Gumbel):

  • skewness \(= \dfrac{12\sqrt{6}\,\zeta(3)}{\pi^3} \approx 1.13955\)

  • excess kurtosis \(= \dfrac{12}{5} = 2.4\)

MGF / characteristic function#

  • For Gumbel (\(\xi=0\)), the MGF exists for \(t < 1/\sigma\): \(M_X(t)=\mathbb{E}[e^{tX}] = \exp(\mu t)\,\Gamma(1-\sigma t)\). The characteristic function is \(\varphi_X(t)=\mathbb{E}[e^{itX}] = \exp(i\mu t)\,\Gamma(1-i\sigma t)\).

  • For Fréchet-type (\(\xi>0\)), the MGF diverges for any \(t>0\) (too heavy a tail).

  • For general \(\xi\ne 0\), closed forms are not elementary; numerical integration / Monte Carlo is typical.

Entropy#

The differential entropy (in nats) has a simple closed form:

\[ H(X) = \log \sigma + (1+\xi)\,\gamma + 1. \]

Max-stability#

If \(X_1,\dots,X_n\) are i.i.d. GEV with shape \(\xi\), then \(\max_i X_i\) is also GEV with the same shape \(\xi\) (but different location/scale).
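Max-stability can be checked numerically: \(F(x)^n\) should again be a GEV CDF with the same shape. Equating \(F(x)^n\) with a GEV CDF gives (for \(\xi\ne 0\)) the new location \(\mu_n=\mu+\sigma(n^\xi-1)/\xi\) and scale \(\sigma_n=\sigma n^\xi\); the sketch below verifies this on a grid:

```python
import numpy as np
from scipy.stats import genextreme

mu, sigma, xi, n = 0.0, 1.0, 0.3, 10
c = -xi  # SciPy sign convention

# grid inside the support (lower endpoint mu - sigma/xi = -1/0.3)
x = np.linspace(-3.0, 25.0, 600)

# CDF of the max of n i.i.d. GEVs, two ways
F_pow = genextreme.cdf(x, c, loc=mu, scale=sigma) ** n

mu_n = mu + sigma * (n**xi - 1.0) / xi
sigma_n = sigma * n**xi
F_max = genextreme.cdf(x, c, loc=mu_n, scale=sigma_n)

max_abs_diff = float(np.max(np.abs(F_pow - F_max)))
print("max |F^n - GEV(mu_n, sigma_n, xi)| =", max_abs_diff)
```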

def gev_closed_form_stats(mu=0.0, sigma=1.0, xi=0.0):
    """Return (mean, var, skew, excess_kurt, entropy) when they exist.

    Notes
    - Mean exists iff xi < 1
    - Var exists iff xi < 1/2
    - Skew exists iff xi < 1/3
    - Excess kurtosis exists iff xi < 1/4
    - Entropy exists for all xi and sigma>0
    """
    if not (sigma > 0):
        raise ValueError("sigma must be > 0")

    entropy = np.log(sigma) + (1.0 + xi) * EULER_GAMMA + 1.0

    if np.isclose(xi, 0.0):
        mean = mu + sigma * EULER_GAMMA
        var = sigma**2 * (PI**2 / 6.0)
        skew = 12.0 * np.sqrt(6.0) * zeta(3.0) / (PI**3)
        excess_kurt = 12.0 / 5.0
        return float(mean), float(var), float(skew), float(excess_kurt), float(entropy)

    g1 = gamma(1.0 - xi)

    mean = np.inf if xi >= 1 else mu + (sigma / xi) * (g1 - 1.0)

    if xi >= 0.5:
        var = np.inf
        skew = np.nan
        excess_kurt = np.nan
        return float(mean), float(var), float(skew), float(excess_kurt), float(entropy)

    g2 = gamma(1.0 - 2.0 * xi)
    var = (sigma**2 / xi**2) * (g2 - g1**2)

    # Skewness depends only on xi (affine invariance)
    if xi >= (1.0 / 3.0):
        skew = np.nan
    else:
        g3 = gamma(1.0 - 3.0 * xi)
        mu2 = g2 - g1**2
        mu3 = g3 - 3.0 * g2 * g1 + 2.0 * g1**3
        skew = mu3 / (mu2 ** 1.5)

    if xi >= 0.25:
        excess_kurt = np.nan
    else:
        g3 = gamma(1.0 - 3.0 * xi)  # safe here because xi < 1/4
        g4 = gamma(1.0 - 4.0 * xi)
        mu2 = g2 - g1**2
        mu4 = g4 - 4.0 * g3 * g1 + 6.0 * g2 * g1**2 - 3.0 * g1**4
        excess_kurt = mu4 / (mu2**2) - 3.0

    return float(mean), float(var), float(skew), float(excess_kurt), float(entropy)


mu, sigma, xi = 0.3, 1.7, 0.2
c, loc, scale = evt_to_scipy(mu=mu, sigma=sigma, xi=xi)

mean_cf, var_cf, skew_cf, exkurt_cf, entr_cf = gev_closed_form_stats(mu=mu, sigma=sigma, xi=xi)

mean_sp, var_sp, skew_sp, exkurt_sp = genextreme_sp.stats(c, loc=loc, scale=scale, moments="mvsk")
entr_sp = genextreme_sp.entropy(c, loc=loc, scale=scale)

print("EVT params (mu, sigma, xi):", (mu, sigma, xi))
print("SciPy params (c, loc, scale):", (c, loc, scale))
print()
print("mean (closed form):", mean_cf)
print("mean (SciPy)      :", float(mean_sp))
print("var  (closed form):", var_cf)
print("var  (SciPy)      :", float(var_sp))
print("skew (closed form):", skew_cf)
print("skew (SciPy)      :", float(skew_sp))
print("excess kurt (closed):", exkurt_cf)
print("excess kurt (SciPy) :", float(exkurt_sp))
print("entropy (closed)    :", entr_cf)
print("entropy (SciPy)     :", float(entr_sp))
EVT params (mu, sigma, xi): (0.3, 1.7, 0.2)
SciPy params (c, loc, scale): (-0.2, 0.3, 1.7)

mean (closed form): 1.6959525666650759
mean (SciPy)      : 1.6959525666650779
var  (closed form): 9.664262775040937
var  (SciPy)      : 9.664262775040893
skew (closed form): 3.5350716046213435
skew (SciPy)      : 3.535071604621334
excess kurt (closed): 45.091512125815335
excess kurt (SciPy) : 45.091512125815335
entropy (closed)    : 2.2232870489440097
entropy (SciPy)     : 2.2232870489440097

5) Parameter interpretation#

SciPy exposes genextreme as a location–scale family with an additional shape parameter.

Location (loc / \(\mu\))#

Shifts the distribution left/right.

Scale (scale / \(\sigma\))#

Stretches the distribution: larger scale means more spread.

Shape (c / \(\xi\))#

The shape controls tail behavior:

  • \(\xi>0\) (SciPy: c<0): heavy tail; very large maxima are more plausible.

  • \(\xi=0\) (SciPy: c=0): Gumbel; light tail.

  • \(\xi<0\) (SciPy: c>0): bounded above at \(x_{\max}=\mu-\sigma/\xi\).

A practical interpretation uses return levels. If a block maximum is modeled as \(X\sim\mathrm{GEV}(\mu,\sigma,\xi)\), the \(T\)-block return level is

\[ q_T = F^{-1}\left(1-\frac{1}{T}\right). \]

If you have one block per year, \(q_{100}\) is the “100-year return level”.
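For \(\xi\ne 0\) the return level has the explicit form \(q_T = \mu + \tfrac{\sigma}{\xi}\bigl((-\log(1-1/T))^{-\xi}-1\bigr)\), obtained by inverting the CDF. A quick cross-check against SciPy's `ppf` (sketch, arbitrary parameters):

```python
import numpy as np
from scipy.stats import genextreme

mu, sigma, xi = 0.0, 1.0, 0.2
T = 100
p = 1.0 - 1.0 / T

# closed form: q_T = mu + (sigma/xi) * ((-log(1 - 1/T))**(-xi) - 1)
y = -np.log(p)
q_formula = mu + (sigma / xi) * (y ** (-xi) - 1.0)

# SciPy, remembering c = -xi
q_scipy = float(genextreme.ppf(p, -xi, loc=mu, scale=sigma))

print("100-block return level (formula):", q_formula)
print("100-block return level (SciPy)  :", q_scipy)
```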

mu, sigma = 0.0, 1.0
xis = [-0.3, 0.0, 0.2]

x = np.linspace(-6, 8, 1200)

fig = go.Figure()
for xi in xis:
    fig.add_trace(
        go.Scatter(
            x=x,
            y=gev_pdf(x, mu=mu, sigma=sigma, xi=xi),
            mode="lines",
            name=f"xi={xi}",
        )
    )

# Mark finite endpoint when xi < 0 and lower bound when xi > 0
for xi in xis:
    if xi < 0:
        _, upper = gev_support(mu=mu, sigma=sigma, xi=xi)
        fig.add_vline(x=upper, line_dash="dot", line_color="gray")
    if xi > 0:
        lower, _ = gev_support(mu=mu, sigma=sigma, xi=xi)
        fig.add_vline(x=lower, line_dash="dot", line_color="gray")

fig.update_layout(
    title="GEV PDFs for different shape parameters (mu=0, sigma=1)",
    xaxis_title="x",
    yaxis_title="density",
    width=900,
    height=420,
)
fig.show()

6) Derivations#

A key trick: turn the CDF into an exponential#

Start from the EVT CDF (for \(\xi\ne 0\)):

\[ F(x)=\exp\left(-\left(1+\xi\frac{x-\mu}{\sigma}\right)^{-1/\xi}\right). \]

Let \(U\sim\mathrm{Uniform}(0,1)\) and set \(Y=-\log U\). Then \(Y\sim\mathrm{Exp}(1)\). Solve \(U=F(X)\) for \(X\):

(3)#\[\begin{align} U &= \exp\left(-\left(1+\xi\frac{X-\mu}{\sigma}\right)^{-1/\xi}\right)\\ -\log U &= \left(1+\xi\frac{X-\mu}{\sigma}\right)^{-1/\xi}\\ Y &= \left(1+\xi\frac{X-\mu}{\sigma}\right)^{-1/\xi}\\ 1+\xi\frac{X-\mu}{\sigma} &= Y^{-\xi}\\ X &= \mu + \frac{\sigma}{\xi}\left(Y^{-\xi}-1\right). \end{align}\]

For the Gumbel case \(\xi=0\), the continuous limit gives

\[ X = \mu - \sigma\,\log Y. \]

This representation is useful for both sampling and moments.

Expectation and variance#

Using \(Y\sim\mathrm{Exp}(1)\) and \(X=\mu+\tfrac{\sigma}{\xi}(Y^{-\xi}-1)\):

(4)#\[\begin{align} \mathbb{E}[X] &= \mu + \frac{\sigma}{\xi}\left(\mathbb{E}[Y^{-\xi}] - 1\right). \end{align}\]

Because an exponential is a Gamma\((1,1)\) random variable,

\[ \mathbb{E}[Y^{-a}] = \int_0^\infty y^{-a}e^{-y}dy = \Gamma(1-a), \]

which is finite iff \(a<1\). Setting \(a=\xi\) yields the mean formula; setting \(a=2\xi\) yields the variance formula.
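A quick numerical check of this identity (a sketch; the integral is split at 1 to help `quad` with the integrable singularity at 0):

```python
import numpy as np
from scipy import integrate, special

a = 0.3  # any a < 1 works; for a >= 1 the integral diverges at 0

def integrand(y):
    return y ** (-a) * np.exp(-y)

# split at 1: endpoint singularity on (0, 1], smooth tail on [1, inf)
v1, _ = integrate.quad(integrand, 0.0, 1.0)
v2, _ = integrate.quad(integrand, 1.0, np.inf)
val = v1 + v2

target = float(special.gamma(1.0 - a))  # Gamma(0.7)
print("numerical:", val, "  Gamma(1-a):", target)
```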

Likelihood#

For data \(x_1,\dots,x_n\) and \(z_i=(x_i-\mu)/\sigma\), the log-likelihood (for \(\xi\ne 0\)) is

\[ \ell(\mu,\sigma,\xi) = -n\log\sigma - \left(1+\frac{1}{\xi}\right)\sum_{i=1}^n \log\left(1+\xi z_i\right) - \sum_{i=1}^n \left(1+\xi z_i\right)^{-1/\xi}, \]

subject to the support constraints \(1+\xi z_i>0\) for all \(i\).

For \(\xi=0\) (Gumbel):

\[ \ell(\mu,\sigma,0) = -n\log\sigma - \sum_{i=1}^n \left(z_i + e^{-z_i}\right). \]

In practice, MLE for GEV is done numerically (SciPy’s fit does this).
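A minimal sketch of doing that numerically yourself, minimizing the negative log-likelihood with Nelder–Mead on \((c, \mathrm{loc}, \log\mathrm{scale})\); the starting values and the penalty constant for out-of-support parameters are ad hoc assumptions:

```python
import numpy as np
from scipy import optimize
from scipy.stats import genextreme

rng = np.random.default_rng(0)
c_true, loc_true, scale_true = -0.2, 0.0, 1.0  # EVT shape xi = 0.2
data = genextreme.rvs(c_true, loc=loc_true, scale=scale_true,
                      size=5000, random_state=rng)

def neg_loglik(theta):
    c, loc, log_scale = theta
    ll = genextreme.logpdf(data, c, loc=loc, scale=np.exp(log_scale)).sum()
    # out-of-support points give ll = -inf; return a large penalty instead
    return -ll if np.isfinite(ll) else 1e10

# crude starting values: Gumbel shape, sample median and log-std
x0 = np.array([0.0, np.median(data), np.log(data.std())])
res = optimize.minimize(
    neg_loglik, x0, method="Nelder-Mead",
    options={"maxiter": 2000, "maxfev": 4000, "xatol": 1e-6, "fatol": 1e-6},
)
c_hat, loc_hat, scale_hat = res.x[0], res.x[1], float(np.exp(res.x[2]))
print("fitted (c, loc, scale):", (c_hat, loc_hat, scale_hat))
```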

def gev_loglikelihood(data, mu=0.0, sigma=1.0, xi=0.0):
    data = np.asarray(data, dtype=float)
    return float(np.sum(gev_logpdf(data, mu=mu, sigma=sigma, xi=xi)))


# Quick sanity check: log-likelihood at true params vs a perturbed set
mu_true, sigma_true, xi_true = 0.5, 1.2, 0.15
x = gev_rvs(size=2000, mu=mu_true, sigma=sigma_true, xi=xi_true, rng=rng)

ll_true = gev_loglikelihood(x, mu=mu_true, sigma=sigma_true, xi=xi_true)
ll_bad = gev_loglikelihood(x, mu=mu_true + 0.5, sigma=sigma_true * 1.4, xi=xi_true + 0.1)

print("loglik(true params):", ll_true)
print("loglik(perturbed)  :", ll_bad)
loglik(true params): -3657.8563555300343
loglik(perturbed)  : -3837.5357980189947

7) Sampling & simulation (NumPy only)#

Because the CDF is explicit, inverse-transform sampling is natural.

Algorithm#

To sample \(X\sim\mathrm{GEV}(\mu,\sigma,\xi)\):

  1. Draw \(U\sim\mathrm{Uniform}(0,1)\).

  2. Set \(Y=-\log U\) (so \(Y\sim\mathrm{Exp}(1)\)).

  3. Return

  • if \(\xi\ne 0\): \(X = \mu + \dfrac{\sigma}{\xi}\left(Y^{-\xi}-1\right)\)

  • if \(\xi=0\): \(X = \mu - \sigma\log Y\).

This is exactly what gev_rvs / gev_ppf implement.

# Validate NumPy-only sampler against SciPy via quantiles
mu, sigma, xi = -0.2, 0.9, -0.25
c, loc, scale = evt_to_scipy(mu=mu, sigma=sigma, xi=xi)

n = 50_000
x_np = gev_rvs(size=n, mu=mu, sigma=sigma, xi=xi, rng=rng)
x_sp = genextreme_sp.rvs(c, loc=loc, scale=scale, size=n, random_state=rng)

qs = np.array([0.01, 0.05, 0.5, 0.95, 0.99])
q_np = np.quantile(x_np, qs)
q_sp = np.quantile(x_sp, qs)

print("quantiles:")
for q, a, b in zip(qs, q_np, q_sp):
    print(f"  q={q:>4.2f}  numpy={a:>10.6f}  scipy={b:>10.6f}  diff={a-b:+.3e}")
quantiles:
  q=0.01  numpy= -1.862651  scipy= -1.878569  diff=+1.592e-02
  q=0.05  numpy= -1.327733  scipy= -1.328652  diff=+9.193e-04
  q=0.50  numpy=  0.118156  scipy=  0.118494  diff=-3.384e-04
  q=0.95  numpy=  1.688802  scipy=  1.670319  diff=+1.848e-02
  q=0.99  numpy=  2.234961  scipy=  2.251763  diff=-1.680e-02

8) Visualization#

We’ll plot:

  • the PDF

  • the CDF

  • a Monte Carlo histogram with the theoretical PDF overlay

mu, sigma, xi = 0.0, 1.1, 0.12

# Build an x-grid that respects support
lower, upper = gev_support(mu=mu, sigma=sigma, xi=xi)
if np.isfinite(lower):
    x_min = lower + 1e-3 * sigma
else:
    x_min = mu - 6 * sigma

if np.isfinite(upper):
    x_max = upper - 1e-3 * sigma
else:
    x_max = mu + 10 * sigma

x_grid = np.linspace(x_min, x_max, 900)

pdf = gev_pdf(x_grid, mu=mu, sigma=sigma, xi=xi)
cdf = gev_cdf(x_grid, mu=mu, sigma=sigma, xi=xi)

# Monte Carlo
n = 12_000
samples = gev_rvs(size=n, mu=mu, sigma=sigma, xi=xi, rng=rng)

# Empirical CDF
xs = np.sort(samples)
ys = np.arange(1, n + 1) / n

fig = make_subplots(rows=1, cols=2, subplot_titles=["PDF", "CDF"])

fig.add_trace(
    go.Histogram(
        x=samples,
        nbinsx=60,
        histnorm="probability density",
        name="MC histogram",
        opacity=0.55,
    ),
    row=1,
    col=1,
)
fig.add_trace(go.Scatter(x=x_grid, y=pdf, mode="lines", name="theoretical PDF"), row=1, col=1)

fig.add_trace(go.Scatter(x=x_grid, y=cdf, mode="lines", name="theoretical CDF"), row=1, col=2)
fig.add_trace(go.Scatter(x=xs, y=ys, mode="lines", name="empirical CDF"), row=1, col=2)

fig.update_layout(
    title=f"GEV visualization (mu={mu}, sigma={sigma}, xi={xi})",
    width=1000,
    height=420,
)
fig.update_xaxes(title_text="x", row=1, col=1)
fig.update_xaxes(title_text="x", row=1, col=2)
fig.update_yaxes(title_text="density", row=1, col=1)
fig.update_yaxes(title_text="probability", row=1, col=2)
fig.show()

9) SciPy integration (scipy.stats.genextreme)#

Key methods:

  • genextreme.pdf(x, c, loc, scale) / logpdf

  • genextreme.cdf(x, c, loc, scale) / ppf

  • genextreme.rvs(c, loc, scale, size=..., random_state=...)

  • genextreme.fit(data) for MLE (returns (c_hat, loc_hat, scale_hat))

Remember SciPy’s sign convention: \(\xi=-c\).

# Compare NumPy-only implementation with SciPy on a grid
mu, sigma, xi = 0.7, 1.4, 0.05
c, loc, scale = evt_to_scipy(mu=mu, sigma=sigma, xi=xi)

x = np.linspace(mu - 6 * sigma, mu + 10 * sigma, 800)

pdf_np = gev_pdf(x, mu=mu, sigma=sigma, xi=xi)
pdf_sp = genextreme_sp.pdf(x, c, loc=loc, scale=scale)

cdf_np = gev_cdf(x, mu=mu, sigma=sigma, xi=xi)
cdf_sp = genextreme_sp.cdf(x, c, loc=loc, scale=scale)

print("max |pdf diff|:", float(np.max(np.abs(pdf_np - pdf_sp))))
print("max |cdf diff|:", float(np.max(np.abs(cdf_np - cdf_sp))))

# MLE fit demo
x_sim = genextreme_sp.rvs(c, loc=loc, scale=scale, size=3000, random_state=rng)
c_hat, loc_hat, scale_hat = genextreme_sp.fit(x_sim)

mu_hat, sigma_hat, xi_hat = scipy_to_evt(c=c_hat, loc=loc_hat, scale=scale_hat)

print()
print("True EVT params:", (mu, sigma, xi))
print("Fitted EVT params:", (mu_hat, sigma_hat, xi_hat))
max |pdf diff|: 2.7755575615628914e-16
max |cdf diff|: 7.771561172376096e-16

True EVT params: (0.7, 1.4, 0.05)
Fitted EVT params: (0.7178987556895622, 1.3894315601860372, 0.05133996828013741)

10) Statistical use cases#

Hypothesis testing#

A common question: is the shape parameter essentially zero?

  • \(H_0: \xi=0\) (Gumbel) vs \(H_1: \xi\ne 0\) (full GEV)

  • use a likelihood-ratio test (LRT) as a rough guide

In practice, extreme-value datasets are often small (one maximum per block), so asymptotic chi-square p-values can be shaky; bootstrap is common.
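A parametric-bootstrap version of the test can be sketched as follows (B and the sample size are kept small for illustration; real analyses use far more replicates):

```python
import numpy as np
from scipy.stats import genextreme

rng = np.random.default_rng(1)

def lr_stat(data):
    # unconstrained fit vs fit constrained to c = 0 (Gumbel)
    c1, l1, s1 = genextreme.fit(data)
    ll1 = genextreme.logpdf(data, c1, loc=l1, scale=s1).sum()
    _, l0, s0 = genextreme.fit(data, f0=0.0)
    ll0 = genextreme.logpdf(data, 0.0, loc=l0, scale=s0).sum()
    return 2.0 * (ll1 - ll0), (l0, s0)

data = genextreme.rvs(-0.2, loc=0.0, scale=1.0, size=300, random_state=rng)
lr_obs, (l0, s0) = lr_stat(data)

B = 50  # small for illustration
lr_boot = np.empty(B)
for b in range(B):
    # simulate under the fitted null (Gumbel) and recompute the statistic
    sim = genextreme.rvs(0.0, loc=l0, scale=s0, size=len(data), random_state=rng)
    lr_boot[b], _ = lr_stat(sim)

# bootstrap p-value with the usual +1 correction
p_boot = float((1 + np.sum(lr_boot >= lr_obs)) / (1 + B))
print("observed LR:", lr_obs, "  bootstrap p-value:", p_boot)
```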

Bayesian modeling#

A Bayesian GEV model uses the same likelihood but places priors on parameters, e.g.

  • \(\mu\sim \mathcal{N}(m_0, s_0^2)\)

  • \(\log\sigma\sim \mathcal{N}(a_0, b_0^2)\) (enforces \(\sigma>0\))

  • \(\xi\sim \mathcal{N}(0, \tau^2)\), possibly truncated if you want to enforce existence of moments

Posterior inference can be done with MCMC (PyMC/Stan) and used to compute posterior distributions of return levels.
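A minimal random-walk Metropolis sketch for this posterior (the prior hyperparameters, step sizes, and chain length are ad hoc assumptions; a real analysis should use PyMC/Stan with proper convergence diagnostics):

```python
import numpy as np
from scipy.stats import genextreme, norm

rng = np.random.default_rng(2)
data = genextreme.rvs(-0.2, loc=0.5, scale=1.0, size=500, random_state=rng)  # xi = 0.2

def log_post(theta):
    mu, log_sigma, xi = theta
    ll = genextreme.logpdf(data, -xi, loc=mu, scale=np.exp(log_sigma)).sum()
    if not np.isfinite(ll):
        return -np.inf  # out of support
    # priors as in the list above (hyperparameters chosen arbitrarily)
    lp = (norm.logpdf(mu, 0.0, 10.0)
          + norm.logpdf(log_sigma, 0.0, 2.0)
          + norm.logpdf(xi, 0.0, 0.5))
    return ll + lp

theta = np.array([np.median(data), np.log(data.std()), 0.1])  # crude start
lp = log_post(theta)
step = np.array([0.05, 0.05, 0.04])  # per-coordinate proposal std

n_iter, burn = 4000, 1000
samples, accepts = [], 0
for _ in range(n_iter):
    prop = theta + step * rng.standard_normal(3)
    lp_prop = log_post(prop)
    if np.log(rng.random()) < lp_prop - lp:  # Metropolis accept step
        theta, lp = prop, lp_prop
        accepts += 1
    samples.append(theta.copy())

chain = np.array(samples)[burn:]
accept_rate = accepts / n_iter
xi_mean = float(chain[:, 2].mean())
print("acceptance rate:", accept_rate)
print("posterior mean of xi:", xi_mean)
```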

Generative modeling#

Once fit, GEV provides a simple generator for extreme scenarios:

  • simulate annual maxima under current assumptions

  • stress-test systems with extreme loads/latencies

  • generate synthetic extremes for downstream risk models

The key is that you’re not modeling the whole distribution—only the tail behavior of maxima.

# Likelihood-ratio test example: H0: xi=0 (Gumbel) vs H1: xi free
# (Illustrative; in practice, bootstrap is often preferred.)

def lr_test_xi_zero(data):
    data = np.asarray(data, dtype=float)

    # Unconstrained MLE (c, loc, scale)
    c1, loc1, scale1 = genextreme_sp.fit(data)
    ll1 = float(np.sum(genextreme_sp.logpdf(data, c1, loc=loc1, scale=scale1)))

    # Constrained MLE under H0: c=0 (=> xi=0)
    _, loc0, scale0 = genextreme_sp.fit(data, f0=0)
    ll0 = float(np.sum(genextreme_sp.logpdf(data, 0.0, loc=loc0, scale=scale0)))

    lr = 2.0 * (ll1 - ll0)
    p = float(stats.chi2.sf(lr, df=1))

    return {
        "ll_H1": ll1,
        "ll_H0": ll0,
        "LR": lr,
        "p_value": p,
        "H1_params": (c1, loc1, scale1),
        "H0_params": (0.0, loc0, scale0),
    }


# Simulate from a non-Gumbel GEV and test
mu_true, sigma_true, xi_true = 0.0, 1.0, 0.2
c_true, loc_true, scale_true = evt_to_scipy(mu=mu_true, sigma=sigma_true, xi=xi_true)

data = genextreme_sp.rvs(c_true, loc=loc_true, scale=scale_true, size=600, random_state=rng)
res = lr_test_xi_zero(data)

print("True xi:", xi_true)
print("LRT statistic:", res["LR"])
print("p-value:", res["p_value"])
print("H1 (c, loc, scale):", res["H1_params"])
print("H0 (c=0, loc, scale):", res["H0_params"])

# Return level example (T-block return level)
T = 100
c_hat, loc_hat, scale_hat = res["H1_params"]
q_T = float(genextreme_sp.ppf(1 - 1 / T, c_hat, loc=loc_hat, scale=scale_hat))
print()
print(f"Estimated {T}-block return level (from H1 fit): {q_T:.4f}")
True xi: 0.2
LRT statistic: 29.41424013354981
p-value: 5.8446596489538924e-08
H1 (c, loc, scale): (-0.15153848777245724, 0.04781441553700701, 1.0114994159620951)
H0 (c=0, loc, scale): (0.0, 0.13652098131156487, 1.0862344920342504)

Estimated 100-block return level (from H1 fit): 6.7756

11) Pitfalls#

  • Invalid parameters: scale / \(\sigma\) must be positive.

  • Support constraints: for \(\xi\ne 0\), you must have \(1+\xi(x-\mu)/\sigma>0\). During fitting, a single out-of-support point makes the likelihood zero.

  • SciPy sign convention: SciPy’s shape is c=-xi. Mixing them up flips “heavy tail” vs “bounded tail”.

  • Near \(\xi=0\) numerics: expressions like \((y^{-\xi}-1)/\xi\) can suffer cancellation when \(|\xi|\) is tiny. Use the \(\xi=0\) formulas (Gumbel) or stable alternatives like expm1.

  • Underflow/overflow: terms like \(\exp(-\exp(-z))\) or \(\exp(-w^{-1/\xi})\) can underflow for extreme \(z\). Work in log-space (logpdf) when possible.

  • Small-sample instability: with one maximum per block (e.g., per year), \(\xi\) estimates can have high variance. Bootstrap or Bayesian modeling helps quantify uncertainty.

  • Non-stationarity: trends/seasonality violate i.i.d. assumptions. Consider covariate-dependent GEV.

# A quick look at the xi -> 0 transition (numerical stability)
# Compare a naive ξ!=0 PPF formula against an expm1-stabilized version.

def gev_ppf_naive(u, mu=0.0, sigma=1.0, xi=0.0):
    u = np.asarray(u, dtype=float)
    y = -np.log(u)
    if xi == 0.0:
        z = -np.log(y)
    else:
        z = (np.power(y, -xi) - 1.0) / xi
    return mu + sigma * z


u = np.array([1e-6, 1e-3, 0.2, 0.8, 0.999999])
mu, sigma = 0.0, 1.0
x_gumbel = gev_ppf(u, mu=mu, sigma=sigma, xi=0.0)

for xi in [1e-8, -1e-8, 1e-5, -1e-5]:
    x_naive = gev_ppf_naive(u, mu=mu, sigma=sigma, xi=xi)
    x_stable = gev_ppf(u, mu=mu, sigma=sigma, xi=xi)

    err_naive = float(np.max(np.abs(x_naive - x_gumbel)))
    err_stable = float(np.max(np.abs(x_stable - x_gumbel)))

    print(f"xi={xi:+.0e}  max |naive - gumbel| = {err_naive:.3e}   max |stable - gumbel| = {err_stable:.3e}")
xi=+1e-08  max |naive - gumbel| = 9.439e-07   max |stable - gumbel| = 0.000e+00
xi=-1e-08  max |naive - gumbel| = 9.545e-07   max |stable - gumbel| = 0.000e+00
xi=+1e-05  max |naive - gumbel| = 9.544e-04   max |stable - gumbel| = 9.544e-04
xi=-1e-05  max |naive - gumbel| = 9.543e-04   max |stable - gumbel| = 9.543e-04

12) Summary#

  • genextreme is a continuous distribution used to model block maxima; it is the universal limit law for normalized maxima (Fisher–Tippett–Gnedenko).

  • EVT parameterization: \((\mu,\sigma,\xi)\) with support \(1+\xi(x-\mu)/\sigma>0\).

    • \(\xi>0\) heavy tail (Fréchet-type)

    • \(\xi=0\) Gumbel-type

    • \(\xi<0\) bounded above (Weibull-type)

  • Moments exist only for sufficiently small \(\xi\) (mean: \(\xi<1\), variance: \(\xi<1/2\), …).

  • Inverse CDF is explicit, enabling exact NumPy-only sampling.

  • SciPy uses scipy.stats.genextreme with shape c=-xi; fit performs MLE and you can use ppf for return levels.